{ "cells": [ { "cell_type": "markdown", "id": "56362576", "metadata": {}, "source": [ "# Homework 2\n", "\n", "In this homework we'll explore decision trees and overfitting, and learn about the right way to evaluate the performance of a classifier." ] },
{ "cell_type": "code", "execution_count": 114, "id": "d5cb5b7d", "metadata": {}, "outputs": [], "source": [ "from sklearn import datasets\n", "from sklearn import tree\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.metrics import accuracy_score\n", "import numpy as np\n", "import random\n", "from sklearn.tree import export_text\n", "import matplotlib.pyplot as plt" ] },
{ "cell_type": "code", "execution_count": 115, "id": "4f705984", "metadata": {}, "outputs": [], "source": [ "def make_dataset(n, d = 4, p = 0):\n", "    \"\"\"\n", "    Create a dataset with boolean features and a binary class label.\n", "    The label is assigned as (x1 AND x2) OR (x3 AND x4).\n", "\n", "    Arguments:\n", "        n - The number of instances to generate\n", "        d - The number of features per instance. Any features beyond the first four\n", "            are irrelevant to determining the class label.\n", "        p - The probability that the true class label as computed by the expression\n", "            above is flipped. Said differently, this is the probability of class noise.\n", "    \"\"\"\n", "\n", "    assert d >= 4, 'The dataset must have at least 4 features'\n", "    X = [np.random.randint(2, size = d) for _ in range(n)]\n", "    y = [(x[0] and x[1]) or (x[2] and x[3]) for x in X]\n", "    y = [v if random.random() >= p else (v + 1) % 2 for v in y]\n", "    return X, y" ] },
{ "cell_type": "markdown", "id": "e655d42d", "metadata": {}, "source": [ "When evaluating the accuracy of a classifier, the right way to do it is to hold out a test set of instances that were not used to train the classifier and measure accuracy on those instances. The [train_test_split()](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) function in scikit-learn makes it easy to create training and testing sets. Below is an example that shows overfitting, as evidenced by higher accuracy on the training set than on the testing set." ] },
{ "cell_type": "code", "execution_count": 116, "id": "c9296ffb", "metadata": {}, "outputs": [], "source": [ "# Create a dataset with 1000 instances, each with 10 attributes, and 10% class noise\n", "X, y = make_dataset(1000, d = 10, p = 0.1)" ] },
{ "cell_type": "code", "execution_count": 117, "id": "68be6e50", "metadata": {}, "outputs": [], "source": [ "# Make training and testing sets, each with half of the data\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, train_size=0.5)" ] },
{ "cell_type": "code", "execution_count": 118, "id": "c9c91fa8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Training accuracy: 0.96\n", "Testing accuracy: 0.78\n" ] } ], "source": [ "# Train the classifier and evaluate it on train/test splits\n", "clf = tree.DecisionTreeClassifier()\n", "clf.fit(X_train, y_train)\n", "print('Training accuracy: %.2f' % accuracy_score(y_train, clf.predict(X_train)))\n", "print('Testing accuracy: %.2f' % accuracy_score(y_test, clf.predict(X_test)))" ] },
{ "cell_type": "markdown", "id": "4b80c59b", "metadata": {}, "source": [ "Note that if the training set has 0% class noise, we get a perfect tree. Spend some time convincing yourself that the tree below captures the boolean expression that assigns class labels."
] }, { "cell_type": "code", "execution_count": 119, "id": "119f8f27", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|--- feature_3 <= 0.50\n", "| |--- feature_1 <= 0.50\n", "| | |--- class: 0\n", "| |--- feature_1 > 0.50\n", "| | |--- feature_0 <= 0.50\n", "| | | |--- class: 0\n", "| | |--- feature_0 > 0.50\n", "| | | |--- class: 1\n", "|--- feature_3 > 0.50\n", "| |--- feature_2 <= 0.50\n", "| | |--- feature_0 <= 0.50\n", "| | | |--- class: 0\n", "| | |--- feature_0 > 0.50\n", "| | | |--- feature_1 <= 0.50\n", "| | | | |--- class: 0\n", "| | | |--- feature_1 > 0.50\n", "| | | | |--- class: 1\n", "| |--- feature_2 > 0.50\n", "| | |--- class: 1\n", "\n" ] } ], "source": [ "X, y = make_dataset(1000, d = 10, p = 0.0)\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, train_size=0.5)\n", "clf = tree.DecisionTreeClassifier()\n", "clf.fit(X_train, y_train)\n", "print(export_text(clf))" ] },
{ "cell_type": "markdown", "id": "f89da077", "metadata": {}, "source": [ "# Assignment\n", "\n", "Explore the impact of the following on the extent of overfitting:\n", "* The size of the dataset (n in the call to make_dataset)\n", "* The number of irrelevant features (d - 4, where d is the total number of features in the call to make_dataset)\n", "* The probability of class noise (p in the call to make_dataset)\n", "* The minimum number of samples required for a node to be split. This is the min_samples_split parameter to the [DecisionTreeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier) constructor\n", "\n", "Vary each of the parameters above, build and plot learning curves for training and testing accuracy, and for each parameter write up an explanation of its impact on overfitting. In each case, also display at least one decision tree and explain what is causing it to overfit." ] },
{ "cell_type": "markdown", "id": "2c30ebe7", "metadata": {}, "source": [ "Here is an example of generating a learning curve for a fixed-size dataset where the fraction of instances used for training is varied. You can use this template to create your own learning curves." ] },
{ "cell_type": "code", "execution_count": 122, "id": "606a3a3e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 122, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[Figure: training ('train') and testing ('test') accuracy versus the fraction of instances used for training]" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "X, y = make_dataset(1000, d = 10, p = 0.1)\n", "\n", "test_acc = []\n", "train_acc = []\n", "frac = [0.1, 0.2, 0.3, 0.4, 0.5]\n", "for f in frac:\n", "    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, train_size=f)\n", "    clf = tree.DecisionTreeClassifier()\n", "    clf.fit(X_train, y_train)\n", "    train_acc.append(accuracy_score(y_train, clf.predict(X_train)))\n", "    test_acc.append(accuracy_score(y_test, clf.predict(X_test)))\n", "\n", "plt.plot(frac, train_acc, label = 'train')\n", "plt.plot(frac, test_acc, label = 'test')\n", "plt.legend()" ] },
{ "cell_type": "code", "execution_count": 123, "id": "0feb7c3f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|--- feature_2 <= 0.50\n", "| |--- feature_1 <= 0.50\n", "| | |--- feature_0 <= 0.50\n", "| | | |--- feature_7 <= 0.50\n", "| | | | |--- feature_3 <= 0.50\n", "| | | | | |--- class: 0\n", "| | | | |--- feature_3 > 0.50\n", "| | | | | |--- feature_6 <= 0.50\n", "| | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_4 > 0.50\n", "| | | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_8 > 0.50\n", "| | | | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | |--- feature_6 > 0.50\n", "| | | | | | |--- class: 0\n", "| | | |--- feature_7 > 0.50\n", "| | | | |--- feature_8 <= 0.50\n", "| | | | | |--- feature_4 <= 0.50\n", "| | | | | | |--- class: 0\n", "| | | | | |--- feature_4 > 0.50\n", "| | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_5 > 0.50\n", "| | | | | | | |--- feature_3 <= 0.50\n",
"| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_3 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | |--- feature_8 > 0.50\n", "| | | | | |--- feature_3 <= 0.50\n", "| | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_5 > 0.50\n", "| | | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | | | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_4 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | |--- feature_3 > 0.50\n", "| | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_9 > 0.50\n", "| | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_4 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | |--- feature_0 > 0.50\n", "| | | |--- feature_6 <= 0.50\n", "| | | | |--- class: 0\n", "| | | |--- feature_6 > 0.50\n", "| | | | |--- feature_4 <= 0.50\n", "| | | | | |--- feature_7 <= 0.50\n", "| | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | |--- feature_3 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_3 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | |--- feature_8 > 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | |--- feature_7 > 0.50\n", "| | | | | | |--- class: 0\n", "| | | | |--- feature_4 > 0.50\n", "| | | | | |--- feature_3 <= 0.50\n", "| | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | |--- feature_5 > 0.50\n", "| | | | | | | |--- 
feature_9 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | |--- feature_3 > 0.50\n", "| | | | | | |--- class: 0\n", "| |--- feature_1 > 0.50\n", "| | |--- feature_0 <= 0.50\n", "| | | |--- feature_8 <= 0.50\n", "| | | | |--- feature_4 <= 0.50\n", "| | | | | |--- feature_6 <= 0.50\n", "| | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | |--- feature_9 > 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | |--- feature_6 > 0.50\n", "| | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_9 > 0.50\n", "| | | | | | | |--- feature_3 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_3 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | |--- feature_4 > 0.50\n", "| | | | | |--- feature_5 <= 0.50\n", "| | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_7 > 0.50\n", "| | | | | | | |--- feature_3 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_3 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | |--- feature_5 > 0.50\n", "| | | | | | |--- class: 0\n", "| | | |--- feature_8 > 0.50\n", "| | | | |--- feature_9 <= 0.50\n", "| | | | | |--- feature_3 <= 0.50\n", "| | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_4 > 0.50\n", "| | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | | | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | 
| |--- class: 0\n", "| | | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | |--- feature_3 > 0.50\n", "| | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_4 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | |--- feature_7 > 0.50\n", "| | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | | | |--- class: 1\n", "| | | | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_4 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | |--- feature_9 > 0.50\n", "| | | | | |--- feature_5 <= 0.50\n", "| | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_4 > 0.50\n", "| | | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | |--- feature_6 > 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | |--- feature_5 > 0.50\n", "| | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | |--- feature_6 > 0.50\n", "| | | | | | | |--- class: 0\n", "| | |--- feature_0 > 0.50\n", "| | | |--- feature_7 <= 0.50\n", "| | | | |--- feature_9 <= 0.50\n", "| | | | | |--- class: 1\n", "| | | | |--- feature_9 > 0.50\n", "| | | | | |--- feature_3 <= 0.50\n", "| | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_5 > 0.50\n", "| | | | | | 
| |--- class: 0\n", "| | | | | |--- feature_3 > 0.50\n", "| | | | | | |--- class: 1\n", "| | | |--- feature_7 > 0.50\n", "| | | | |--- feature_4 <= 0.50\n", "| | | | | |--- feature_6 <= 0.50\n", "| | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_9 > 0.50\n", "| | | | | | | |--- feature_3 <= 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_3 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | |--- feature_6 > 0.50\n", "| | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_8 > 0.50\n", "| | | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | | |--- feature_3 <= 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | | | | | | | | |--- feature_3 > 0.50\n", "| | | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | |--- feature_4 > 0.50\n", "| | | | | |--- feature_3 <= 0.50\n", "| | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | |--- feature_9 > 0.50\n", "| | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | |--- feature_3 > 0.50\n", "| | | | | | |--- class: 1\n", "|--- feature_2 > 0.50\n", "| |--- feature_3 <= 0.50\n", "| | |--- feature_0 <= 0.50\n", "| | | |--- feature_7 <= 0.50\n", "| | | | |--- class: 0\n", "| | | |--- feature_7 > 0.50\n", "| | | | |--- feature_8 <= 0.50\n", "| | | | | |--- class: 0\n", "| | | | |--- feature_8 > 0.50\n", "| | | | | |--- feature_4 <= 0.50\n", "| | | 
| | | |--- class: 0\n", "| | | | | |--- feature_4 > 0.50\n", "| | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_5 > 0.50\n", "| | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | |--- feature_1 <= 0.50\n", "| | | | | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | | | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_1 > 0.50\n", "| | | | | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | | | | |--- class: 1\n", "| | | | | | | | | |--- feature_9 > 0.50\n", "| | | | | | | | | | |--- class: 0\n", "| | |--- feature_0 > 0.50\n", "| | | |--- feature_1 <= 0.50\n", "| | | | |--- feature_4 <= 0.50\n", "| | | | | |--- class: 0\n", "| | | | |--- feature_4 > 0.50\n", "| | | | | |--- feature_6 <= 0.50\n", "| | | | | | |--- feature_9 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_9 > 0.50\n", "| | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_8 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | |--- feature_6 > 0.50\n", "| | | | | | |--- class: 0\n", "| | | |--- feature_1 > 0.50\n", "| | | | |--- feature_4 <= 0.50\n", "| | | | | |--- feature_5 <= 0.50\n", "| | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | |--- class: 0\n", "| | | | | | |--- feature_8 > 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | |--- feature_5 > 0.50\n", "| | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_6 > 0.50\n", "| | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_8 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | 
|--- feature_7 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | |--- feature_4 > 0.50\n", "| | | | | |--- class: 1\n", "| |--- feature_3 > 0.50\n", "| | |--- feature_1 <= 0.50\n", "| | | |--- feature_8 <= 0.50\n", "| | | | |--- feature_0 <= 0.50\n", "| | | | | |--- feature_9 <= 0.50\n", "| | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_7 > 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | |--- feature_9 > 0.50\n", "| | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_5 > 0.50\n", "| | | | | | | |--- feature_6 <= 0.50\n", "| | | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_6 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | |--- feature_0 > 0.50\n", "| | | | | |--- class: 1\n", "| | | |--- feature_8 > 0.50\n", "| | | | |--- feature_0 <= 0.50\n", "| | | | | |--- class: 1\n", "| | | | |--- feature_0 > 0.50\n", "| | | | | |--- feature_9 <= 0.50\n", "| | | | | | |--- class: 1\n", "| | | | | |--- feature_9 > 0.50\n", "| | | | | | |--- class: 0\n", "| | |--- feature_1 > 0.50\n", "| | | |--- feature_9 <= 0.50\n", "| | | | |--- feature_5 <= 0.50\n", "| | | | | |--- feature_4 <= 0.50\n", "| | | | | | |--- class: 1\n", "| | | | | |--- feature_4 > 0.50\n", "| | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_8 > 0.50\n", "| | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | |--- feature_5 > 0.50\n", "| | | | | |--- feature_6 <= 0.50\n", "| | | | | | |--- feature_0 <= 0.50\n", "| | | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | | |--- class: 0\n", "| | | | | | | |--- feature_8 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | |--- feature_0 > 0.50\n", "| | | | | | | |--- 
feature_8 <= 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_8 > 0.50\n", "| | | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | |--- feature_6 > 0.50\n", "| | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | |--- class: 1\n", "| | | | | | |--- feature_8 > 0.50\n", "| | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | |--- feature_0 <= 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_0 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | |--- feature_9 > 0.50\n", "| | | | |--- feature_0 <= 0.50\n", "| | | | | |--- class: 1\n", "| | | | |--- feature_0 > 0.50\n", "| | | | | |--- feature_6 <= 0.50\n", "| | | | | | |--- class: 1\n", "| | | | | |--- feature_6 > 0.50\n", "| | | | | | |--- feature_8 <= 0.50\n", "| | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | |--- feature_5 <= 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | | |--- feature_5 > 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | |--- feature_8 > 0.50\n", "| | | | | | | |--- feature_4 <= 0.50\n", "| | | | | | | | |--- feature_7 <= 0.50\n", "| | | | | | | | | |--- class: 0\n", "| | | | | | | | |--- feature_7 > 0.50\n", "| | | | | | | | | |--- class: 1\n", "| | | | | | | |--- feature_4 > 0.50\n", "| | | | | | | | |--- class: 1\n", "\n" ] } ], "source": [ "print(export_text(clf))" ] }, { "cell_type": "code", "execution_count": null, "id": "339916bb", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", 
"nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 5 }